4 research outputs found

    Polyhedral optimizations of RNA-RNA interaction computations

    2017 Fall. Includes bibliographical references. Studying RNA-RNA interaction has led to major successes in the treatment of some cancers, including colon, breast and pancreatic cancer, by suppressing the gene expression involved in the development of these diseases. The problem with programs that compute these interactions is that they are computationally and memory intensive: O(N^4) space and O(N^6) time complexity. Moreover, the entire application is complicated and involves many mutually recursive data variables. We address the problem of speeding up a surrogate kernel (named OSPSQ) that captures the main dependence pattern found in two widely used RNA-RNA interaction applications, IRIS and piRNA. The structure of the OSPSQ kernel perfectly fits the constraints of the polyhedral model, a well-developed technology for optimizing codes that belong to many specialized domains. However, the current state-of-the-art automatic polyhedral tools do not significantly improve the performance of the baseline implementation of OSPSQ. With simple techniques like loop permutation and skewing, we achieve an average of 17x sequential and 31x parallel speedup on a standard modern multi-core platform (Intel Broadwell, E5-1650v4). This performance represents 75% and 88% of attainable single-core and multi-core L1 bandwidth, respectively. For further performance improvement, we describe how to tile all six dimensions and also formulate the associated memory trade-off. In the future, we plan to implement these tiling strategies, explore the performance of the code for various tile sizes, and optimize the whole piRNA application.
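
    As a rough illustration of the loop permutation mentioned in the abstract, the sketch below applies it to a toy max-plus matrix product. This is not the OSPSQ kernel, whose definition is not given here; the function names and the kernel itself are assumptions chosen only to show that reordering the loops of a max-reduction leaves the result unchanged while making the innermost accesses contiguous.

```python
NEG_INF = float("-inf")

def maxplus_ijk(a, b):
    """Toy kernel: c[i][j] = max over k of a[i][k] + b[k][j], loops in i-j-k order."""
    n = len(a)
    c = [[NEG_INF] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # The innermost loop walks down a column of b (strided access).
                c[i][j] = max(c[i][j], a[i][k] + b[k][j])
    return c

def maxplus_ikj(a, b):
    """Same computation with the j and k loops permuted (i-k-j order)."""
    n = len(a)
    c = [[NEG_INF] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            row_b = b[k]
            row_c = c[i]
            for j in range(n):
                # The innermost loop now walks along rows of b and c.
                row_c[j] = max(row_c[j], aik + row_b[j])
    return c

def make_matrix(n, seed):
    """Deterministic small integer matrix, used only for the comparison below."""
    return [[(seed * (i + 1) + 7 * j) % 11 for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    a, b = make_matrix(6, 3), make_matrix(6, 5)
    print(maxplus_ijk(a, b) == maxplus_ikj(a, b))
```

    The permutation is legal here because max is associative and commutative, so the order in which the k contributions are accumulated does not matter. In pure Python the speed difference is negligible; the point is only the legality and the access pattern.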

    A Parallel Ensemble of Metaheuristic Solvers for the Traveling Salesman Problem

    The travelling salesman problem (TSP) is one of the most well-studied NP-hard problems in the literature. The state-of-the-art inexact TSP solvers are the Lin-Kernighan-Helsgaun (LKH) heuristic and Edge Assembly Crossover (EAX). A recent study suggests that EAX with restart mechanisms performs well on a wide range of TSP instances. However, that study is limited to problems of up to 2,000 cities. We study problems ranging from 2,000 to 85,900 cities. We see that the performance of each solver varies with the type of problem. However, by combining these solvers in an ensemble setup, we are able to outperform each individual solver. We see the ensemble setup as an efficient way to make use of the abundance of compute resources. In addition to EAX and LKH, we use several versions of the hybrid of EAX and the Mixing Genetic Algorithm (MGA). A hybrid of MGA and EAX is known to solve some hard problems. We see that the ensemble of the hybrid versions outperforms the state-of-the-art solvers on problems larger than 10,000 cities. Comment: First submission was made to Europar, 2021. Paper Rejecte
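
    The ensemble idea in this abstract can be sketched as follows: run several solvers on the same instance and keep the shortest tour. The solvers below (nearest neighbour and a simple 2-opt pass) are hypothetical stand-ins chosen for brevity; they are not EAX, LKH, or MGA, and the sketch runs them sequentially rather than in parallel.

```python
import math

def tour_length(cities, tour):
    """Length of the closed tour visiting cities in the given order."""
    n = len(tour)
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % n]]) for i in range(n))

def nearest_neighbour(cities, start=0):
    """Stand-in solver 1: greedy construction, ties broken by city index."""
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda c: (math.dist(cities[last], cities[c]), c))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def two_opt(cities, tour):
    """Stand-in solver 2: reverse segments for as long as that shortens the tour."""
    tour = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(cities, candidate) < tour_length(cities, tour) - 1e-12:
                    tour, improved = candidate, True
    return tour

def ensemble(cities, solvers):
    """Run every solver on the instance and return the shortest tour found."""
    results = [solver(cities) for solver in solvers]
    return min(results, key=lambda t: tour_length(cities, t))

if __name__ == "__main__":
    cities = [(0, 0), (3, 1), (1, 4), (5, 5), (2, 2), (6, 0), (4, 3)]
    solvers = [
        nearest_neighbour,
        lambda c: nearest_neighbour(c, start=3),
        lambda c: two_opt(c, nearest_neighbour(c)),
    ]
    best = ensemble(cities, solvers)
    print(best, tour_length(cities, best))
```

    By construction the ensemble result is never worse than the best individual solver on a given instance. In a real setup each solver would run on its own compute resources, which this sequential sketch does not attempt.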

    Dataset associated with "Polyhedral optimizations of RNA-RNA interaction computations"

    These files can be used to re-create the results in the thesis manuscript "Polyhedral Optimizations of RNA-RNA Interaction Computations". One will need to use the AlphaZ tool (http://www.cs.colostate.edu/AlphaZ/wiki/doku.php) to produce results from the .ab and .cs files. Users not familiar with AlphaZ can use the C codes instead. Studying RNA-RNA interaction has led to major successes in the treatment of some cancers, including colon, breast and pancreatic cancer, by suppressing the gene expression involved in the development of these diseases. The problem with programs that compute these interactions is that they are computationally and memory intensive: O(N^4) space and O(N^6) time complexity. Moreover, the entire application is complicated and involves many mutually recursive data variables. We address the problem of speeding up a surrogate kernel (named OSPSQ) that captures the main dependence pattern found in two widely used RNA-RNA interaction applications, IRIS and piRNA. The structure of the OSPSQ kernel perfectly fits the constraints of the polyhedral model, a well-developed technology for optimizing codes that belong to many specialized domains. However, the current state-of-the-art automatic polyhedral tools do not significantly improve the performance of the baseline implementation of OSPSQ. With simple techniques like loop permutation and skewing, we achieve an average of 17x sequential and 31x parallel speedup on a standard modern multi-core platform (Intel Broadwell, E5-1650v4). This performance represents 75% and 88% of attainable single-core and multi-core L1 bandwidth, respectively. For further performance improvement, we describe how to tile all six dimensions and also formulate the associated memory trade-off. In the future, we plan to implement these tiling strategies, explore the performance of the code for various tile sizes, and optimize the whole piRNA application.

    Mixing genetic algorithm for traveling salesman problem, The

    2022 Spring. Includes bibliographical references. The Traveling Salesman Problem (TSP) is one of the most intensively studied NP-hard problems. The TSP solvers are well-suited for multi-core CPU-based architectures. With the decline of Moore's law, there is an increasing need to port the codes to massively parallel architectures such as the GPU. This thesis focuses on Genetic Algorithm (GA) based TSP solvers. The major drawbacks in porting the state-of-the-art GA-based TSP solver, the Edge Assembly Crossover (EAX), are that (a) the memory per crossover operation is large and limits the scalability of the solver, and (b) the communication per crossover operation is random and not favorable for SIMD machines. We designed a new solver, the Mixing Genetic Algorithm (MGA), using the Generalized Partition Crossover (GPX) operator to overcome these drawbacks. The GPX consumes 4x less memory and does not access the memory during the crossover operation. The MGA is used in three different modes. (1) As a single solver, MGA can converge fast on problems smaller than 2,000 cities. (2) As a hybrid solver, together with EAX, it speeds up the convergence rate for problems up to 85,900 cities. (3) In an ensemble setting, together with EAX and an iterated local search (the Lin-Kernighan-Helsgaun (LKH) heuristic), it increases the success rate on some of the hard TSP instances. The MGA is parallelized on shared memory (using OpenMP), distributed memory (using MPI), and GPU (using CUDA). A combination of OpenMP and MPI parallelization is examined on problems ranging from 5,000 to 85,900 cities. We show near-linear speedup (proportional to the number of parallel units) on these instances. Preliminary results on GPU parallelization of the partition phase of the GPX crossover operator show a 48x to 625x speedup over the naive sequential implementation. This is the first step towards the fine-grain parallelization of GA operators for TSP. The results are tested on problems ranging from 10,000 to 2M cities.